
DESCRIPTION 



MATERIALS AND METHODS FOR 
INCREASING CORN SEED WEIGHT 

This invention was made with government support under National Science Foundation 
grant number 93052818. The government has certain rights in this invention. 



Cross-Reference to a Related Application 
This application is a continuation-in-part of co-pending application Serial No. 08/299,675,
filed September 1, 1994.



Background of the Invention
ADP-glucose pyrophosphorylase (AGP) catalyzes the conversion of ATP and α-glucose-1-phosphate
to ADP-glucose and pyrophosphate. ADP-glucose is used as a glycosyl donor in starch
biosynthesis by plants and in glycogen biosynthesis by bacteria. The importance of ADP-glucose
pyrophosphorylase as a key enzyme in the regulation of starch biosynthesis was noted in the study
of starch deficient mutants of maize (Zea mays) endosperm (Tsai and Nelson, 1966; Dickinson
and Preiss, 1969). AGP enzymes have been isolated from both bacteria and plants.

Bacterial AGP consists of a homotetramer, while plant AGP from photosynthetic and
non-photosynthetic tissues is a heterotetramer composed of two different subunits. The plant enzyme
is encoded by two different genes, with one subunit being larger than the other. This feature has
been noted in a number of plants. The AGP subunits in spinach leaf have molecular weights of
54 kDa and 51 kDa, as estimated by SDS-PAGE. Both subunits are immunoreactive with
antibody raised against purified AGP from spinach leaves (Copeland and Preiss, 1981; Morell et
al., 1987). Immunological analysis using antiserum prepared against the small and large subunits
of spinach leaf showed that potato tuber AGP is also encoded by two genes (Okita et al., 1990).
The cDNA clones of the two subunits of potato tuber (50 and 51 kDa) have also been isolated
and sequenced (Muller-Rober et al., 1990; Nakata et al., 1991).

As Hannah and Nelson (Hannah and Nelson, 1975 and 1976) postulated, both Shrunken-2
(Sh2) (Bhave et al., 1990) and Brittle-2 (Bt2) (Bae et al., 1990) are structural genes of maize
endosperm ADP-glucose pyrophosphorylase. Sh2 and Bt2 encode the large subunit and small
subunit of the enzyme, respectively. From cDNA sequencing, Sh2 and Bt2 proteins have predicted
molecular weights of 57,179 Da (Shaw and Hannah, 1992) and 52,224 Da, respectively. The
endosperm is the site of most starch deposition during kernel development in maize. The sh2 and
bt2 maize endosperm mutants have greatly reduced starch levels corresponding to deficient levels of
AGP activity. Mutations of either gene have been shown to reduce AGP activity by about 95%
(Tsai and Nelson, 1966; Dickinson and Preiss, 1969). Furthermore, it has been observed that
enzymatic activities increase with the dosage of functional wild type Sh2 and Bt2 alleles, whereas
mutant enzymes have altered kinetic properties. AGP is the rate limiting step in starch
biosynthesis in plants. Stark et al. placed a mutant form of E. coli AGP in potato tuber and
obtained a 35% increase in starch content (Stark, 1992).

The cloning and characterization of the genes encoding the AGP enzyme subunits have
been reported for various plants. These include Sh2 cDNA (Bhave et al., 1990), Sh2 genomic
DNA (Shaw and Hannah, 1992), and Bt2 cDNA (Bae et al., 1990) from maize; small subunit
cDNA (Anderson et al., 1989) and genomic DNA (Anderson et al., 1991) from rice; and small and
large subunit cDNAs from spinach leaf (Morell et al., 1987) and potato tuber (Muller-Rober et
al., 1990; Nakata et al., 1991). In addition, cDNA clones have been isolated from wheat
endosperm and leaf tissue (Olive et al., 1989) and Arabidopsis thaliana leaf (Lin et al., 1988).

AGP functions as an allosteric enzyme in all tissues and organisms investigated to date.
The allosteric properties of AGP were first shown to be important in E. coli. A
glycogen-overproducing E. coli mutant was isolated and the mutation mapped to the structural gene for
AGP, designated as glgC. The mutant E. coli, known as glgC-16, was shown to be more sensitive
to the activator, fructose 1,6 bisphosphate, and less sensitive to the inhibitor, cAMP (Preiss, 1984).
Although plant AGPs are also allosteric, they respond to different effector molecules than
bacterial AGPs. In plants, 3-phosphoglyceric acid (3-PGA) functions as an activator while
phosphate (PO4) serves as an inhibitor (Dickinson and Preiss, 1969).

In view of the fact that endosperm starch content comprises approximately 70% of the
dry weight of the seed, alterations in starch biosynthesis correlate with seed weight.
Unfortunately, the undesirable effect associated with such alterations has been an increase in the
relative starch content of the seed. Therefore, the development of a method for increasing seed
weight in plants without increasing the relative starch content of the seed is an object of the
subject invention.

Brief Summary of the Invention 
The subject invention concerns a novel variant of the Shrunken-2 (Sh2) gene from maize.
The Sh2 gene encodes ADP-glucose pyrophosphorylase (AGP), an important enzyme involved in
starch synthesis in the major part of the corn seed, the endosperm. In a preferred embodiment,
the novel gene of the subject invention encodes a variant AGP protein which has two additional
amino acids inserted into the sequence. The variant gene described herein has been termed the
Sh2-m1Rev6 gene. Surprisingly, the presence of the Sh2-m1Rev6 gene in a corn plant results in
a substantial increase in corn seed weight when compared to wild type seed weight, but does so
in the absence of an increase in the relative starch content of the kernel.

The subject invention further concerns a method of using the variant sh2 gene in maize
to increase seed weight. The subject invention also concerns plants having the variant sh2 gene
and expressing the mutant protein in the seed endosperm.

As described herein, the sh2 variant, Sh2-m1Rev6, can be produced using in vivo,
site-specific mutagenesis. A transposable element was used to create a series of mutations in the
sequence of the gene that encodes the enzyme. As a result, the Sh2-m1Rev6 gene encodes an
additional amino acid pair within or close to the allosteric binding site of the protein.

Brief Description of the Sequences 
SEQ ID NO. 1 is the genomic nucleotide sequence of the Sh2-m1Rev6 gene.

SEQ ID NO. 2 is the nucleotide sequence of the Sh2-m1Rev6 cDNA.

SEQ ID NO. 3 is the amino acid sequence of the protein encoded by nucleotides 87
through 1640 of SEQ ID NO. 2.

SEQ ID NO. 4 is a nucleotide sequence encoding the amino acid sequence shown in SEQ
ID NO. 5.

SEQ ID NO. 5 is the amino acid sequence of an ADP-glucose pyrophosphorylase (AGP)
enzyme subunit containing a single serine insertion.
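
The coordinates quoted in these descriptions can be checked against the lengths given in the sequence listing. The following is a minimal sketch in Python, not part of the original disclosure. It assumes that nucleotides 87 through 1640 of SEQ ID NO. 2 form an uninterrupted open reading frame without the stop codon, and that the 1551 base pairs listed for SEQ ID NO. 4 are all coding; the function name is hypothetical.

```python
# Coordinates and lengths quoted in the sequence descriptions and listing.
CDS_START = 87       # first coding nucleotide in SEQ ID NO. 2
CDS_END = 1640       # last coding nucleotide in SEQ ID NO. 2
SEQ4_LENGTH = 1551   # length of SEQ ID NO. 4 in base pairs


def codons_in_span(start: int, end: int) -> int:
    """Return the number of complete codons in an inclusive, 1-based span."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"span of {length} nt is not a whole number of codons")
    return length // 3


# Assumption: neither span includes a stop codon.
rev6_residues = codons_in_span(CDS_START, CDS_END)
serine_residues = codons_in_span(1, SEQ4_LENGTH)

print(rev6_residues)                    # residues encoded by the Rev6 cDNA span
print(serine_residues)                  # residues encoded by SEQ ID NO. 4
print(rev6_residues - serine_residues)  # difference between the two variants
```

Under these assumptions the span in SEQ ID NO. 2 encodes 518 residues, which agrees with the length listed for SEQ ID NO. 3, and SEQ ID NO. 4 encodes one residue fewer, as expected when a single serine is inserted instead of a two amino acid pair.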

Detailed Disclosure of the Invention

The subject invention provides novel variants of the Shrunken-2 (Sh2) gene and a method
for increasing seed weight in a plant through the expression of the variant sh2 gene. The Sh2
gene encodes a subunit of the enzyme ADP-glucose pyrophosphorylase (AGP) in maize
endosperm. One variant gene, denoted herein as Sh2-m1Rev6, contains an insertion mutation that
encodes an additional tyrosine:serine or serine:tyrosine amino acid pair that is not present in the
wild type protein. The sequences of the wild type DNA and protein are disclosed in Shaw and
Hannah, 1992. The in vivo, site-specific mutation which resulted in the tyrosine:serine or
serine:tyrosine insertion was generated in Sh2 using the transposable element Dissociation (Ds),
which can insert into, and be excised from, the Sh2 gene under appropriate conditions. Ds
excision can alter gene expression through the addition of nucleotides to a gene at the site of
excision of the element.

In a preferred embodiment, insertion mutations in the Sh2 gene were obtained by
screening for germinal revertants after excision of the Ds transposon from the gene. The
revertants were generated by self-pollination of a stock containing the Ds-sh2 mutant allele, the
Activator (Ac) element of this transposable element system, and appropriate outside markers. The
Ds element can transpose when the Ac element is present. Wild type seed were selected, planted,
self-pollinated and crossed onto a tester stock. Results from this test cross were used to remove
wild type alleles due to pollen contamination. Seeds homozygous for each revertant allele were
obtained from the self-progeny. Forty-four germinal revertants of the Ds-induced sh2 mutant were
collected.

Cloning and sequencing of the Ds insertion site showed that the nucleotide insertion
resides in the area of the gene that encodes the binding site for the AGP activator, 3-PGA
(Morrell, 1988). Of the 44 germinal revertants obtained, 28 were sequenced. The sequenced
revertants defined 5 isoalleles of sh2: 13 restored the wild type sequence, 11 resulted in the
insertion of the amino acid tyrosine, two contained an additional serine (inserted between amino
acid residues 494 and 495 of the native protein sequence), one revertant contained
a two amino acid insertion, tyrosine:tyrosine, and the last one, designated as Sh2-m1Rev6,
contained the two amino acid insertion, tyrosine:serine or serine:tyrosine. The Sh2-m1Rev6
variant encodes an AGP enzyme subunit that has either the tyrosine:serine amino acid pair
inserted between the glycine and tyrosine at amino acid residues 494 and 495, respectively, of the
native protein, or the serine:tyrosine amino acid pair inserted between the two tyrosine residues
located at position 495 and 496 of the native protein sequence. Due to the sequence of the amino
acids in the area of the insertions, the Sh2-m1Rev6 variant amino acid sequences encoded by each
of these insertions are identical to each other.
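
The equivalence of the two insertion descriptions can be illustrated with a short sketch in Python, which is not part of the original disclosure. The five-residue native fragment below (residues 493 through 497) is an assumption inferred from SEQ ID NO. 3 with the inserted pair removed, and the helper function is hypothetical.

```python
# Assumed local fragment of the native subunit around the insertion site:
# Glu493, Gly494, Tyr495, Tyr496, Ile497 (inferred, not quoted from the source).
NATIVE = ["Glu", "Gly", "Tyr", "Tyr", "Ile"]
FIRST_POSITION = 493  # residue number of NATIVE[0]


def insert_after(residues, position, inserted):
    """Insert `inserted` immediately after the 1-based residue `position`."""
    index = position - FIRST_POSITION + 1
    return residues[:index] + list(inserted) + residues[index:]


# Tyr:Ser placed between Gly494 and Tyr495.
variant_a = insert_after(NATIVE, 494, ["Tyr", "Ser"])
# Ser:Tyr placed between Tyr495 and Tyr496.
variant_b = insert_after(NATIVE, 495, ["Ser", "Tyr"])

print("-".join(variant_a))
print("-".join(variant_b))
print(variant_a == variant_b)
```

Both calls yield Glu-Gly-Tyr-Ser-Tyr-Tyr-Ile, because the residue immediately following the insertion point is itself a tyrosine; this local sequence is consistent with residues 493 through 499 of SEQ ID NO. 3.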

Surprisingly, the expression of the Sh2-m1Rev6 gene in maize resulted in a significant
increase in seed weight over that obtained from maize expressing the wild-type Sh2 allele.
Moreover, seeds from plants having the Sh2-m1Rev6 gene contained approximately the same
percentage starch content relative to any of the other revertants generated. In a preferred
embodiment, the Sh2-m1Rev6 gene is contained in homozygous form within the genome of a plant
seed.

The subject invention further concerns a plant that has the Sh2-m1Rev6 gene incorporated
into its genome. Other alleles disclosed herein can also be incorporated into a plant genome.
In a preferred embodiment, the plant is a monocotyledonous species. More preferably, the plant
may be Zea mays. Plants having the Sh2-m1Rev6 gene can be grown from seeds that have the
gene in their genome. In addition, techniques for transforming plants with a gene are known in
the art.

Because of the degeneracy of the genetic code, a variety of different polynucleotide
sequences can encode the variant AGP polypeptide disclosed herein. In addition, it is well within
the skill of a person trained in the art to create alternative polynucleotide sequences encoding the
same, or essentially the same, polypeptide of the subject invention. These variant or alternative
polynucleotide sequences are within the scope of the subject invention. As used herein, references
to "essentially the same" sequence refer to sequences which encode amino acid substitutions,
deletions, additions, or insertions which do not materially alter the functional activity of the
polypeptide encoded by Sh2-m1Rev6 or the other alleles. The subject invention also contemplates
those polynucleotide molecules having sequences which are sufficiently homologous with the wild
type Sh2 DNA sequence so as to permit hybridization with that sequence under standard
high-stringency conditions. Such hybridization conditions are conventional in the art (see, e.g.,
Maniatis et al., 1989).
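
As an illustration of the degeneracy referred to above, the following Python sketch (not part of the original disclosure) enumerates every DNA sequence that encodes the inserted tyrosine:serine pair under the standard genetic code. The codon lists are the standard assignments for tyrosine and serine; the function name is hypothetical.

```python
from itertools import product

# Standard genetic code: DNA codons for the two inserted residues.
CODONS = {
    "Tyr": ["TAT", "TAC"],
    "Ser": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
}


def encodings(peptide):
    """Return every DNA sequence that encodes the given list of residues."""
    return ["".join(codons) for codons in product(*(CODONS[r] for r in peptide))]


tyr_ser = encodings(["Tyr", "Ser"])
print(len(tyr_ser))
print(tyr_ser)
```

The two codons for tyrosine and six for serine give twelve distinct six-nucleotide sequences for this pair alone, which is why many different polynucleotides can encode the same variant polypeptide.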

The polynucleotide molecules of the subject invention can be used to transform plants
to express the Sh2-m1Rev6 allele, or other alleles of the subject invention, in those plants. In
addition, the polynucleotides of the subject invention can be used to express the recombinant
variant AGP enzyme. They can also be used as a probe to detect related enzymes. The
polynucleotides can also be used as DNA sizing standards.

The polypeptides encoded by the polynucleotides of the subject invention can be used to
catalyze the conversion of ATP and α-glucose-1-phosphate to ADP-glucose and pyrophosphate,
or to raise an immunogenic response to the AGP enzymes and variants thereof. They can also
be used as molecular weight standards, or as an inert protein in an assay.

The following are examples which illustrate procedures and processes, including the best
mode, for practicing the invention. These examples should not be construed as limiting, and are
not intended to be a delineation of all possible modifications to the technique. All percentages
are by weight and all solvent mixture proportions are by volume unless otherwise noted.



Example 1 - Expression of Sh2-m1Rev6 Gene in Maize Endosperm

Homozygous plants of each revertant obtained after excision of the Ds transposon were
crossed onto the F1 hybrid corn, "Florida Stay Sweet." This sweet corn contains a null allele for
the Sh2 gene, termed sh2-R. Resulting endosperms contained one dose of the functional allele
from a revertant and two female-derived null alleles, denoted by the following genotype:
Sh2-m1RevX/sh2-R/sh2-R, where X represents one of the various isoalleles of the revertants. Crosses
were made during two growing seasons.

Resulting seed weight data for each revertant and wild type seed are shown in Table 1. 
The first column shows the amino acid insertion in the AGP enzyme obtained after the in vivo, 
site-specific mutagenesis. 



Table 1.

Sequence alteration   # of revertants   Average seed weight   Standard deviation
wild type             13                0.250 grams           0.015
tyrosine              11                0.238 grams           0.025
serine                2                 0.261 grams           0.014
tyr,tyr               1                 0.223 grams           nd*
tyr,ser (Rev6)        1                 0.289 grams           0.022

*nd = not determined

The data shown in Table 1 represent the average kernel seed weight for each revertant
over the course of two growing seasons. The expression of the Sh2-m1Rev6 gene to produce the
Rev6 mutant AGP subunit gave rise to an almost 16% increase in seed weight in comparison to
the wild type revertant. The revertants having the single serine insertion also showed an increase
in average seed weight over wild type seed weight.
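
The percentage figures can be recomputed directly from the averages reported in Table 1. The following Python sketch is an illustrative calculation only and is not part of the original disclosure; the variable and function names are hypothetical.

```python
# Average seed weights in grams, as reported in Table 1.
WILD_TYPE = 0.250
WEIGHTS = {
    "tyrosine": 0.238,
    "serine": 0.261,
    "tyr,tyr": 0.223,
    "tyr,ser (Rev6)": 0.289,
}


def percent_change(value: float, reference: float) -> float:
    """Return the change of `value` relative to `reference`, in percent."""
    return (value - reference) / reference * 100.0


changes = {name: percent_change(w, WILD_TYPE) for name, w in WEIGHTS.items()}
for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

On these averages the Rev6 revertant is about 15.6% heavier than the wild type revertant, consistent with the "almost 16%" stated above, and the single serine insertion is about 4.4% heavier, while the tyrosine and tyrosine:tyrosine insertions fall below the wild type value.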

In addition, starch content was determined on the kernels analyzed above using various
methodologies. The analysis showed that Sh2-m1Rev6 containing kernels were no higher in
percentage starch relative to kernels expressing the other alleles shown in the table above.
Therefore, it appears that the increase in seed weight is not solely a function of starch content.

Corn seeds that contain at least one functional Sh2-m1Rev6 allele will be deposited with
the American Type Culture Collection (ATCC), 12301 Parklawn Drive, Rockville, Maryland 20852
USA. The culture will be assigned an accession number by the repository.

The subject culture will be deposited under conditions that assure that access to the
culture will be available during the pendency of this patent application to one determined by the
Commissioner of Patents and Trademarks to be entitled thereto under 37 CFR 1.14 and 35 U.S.C.
122. The deposit will be available as required by foreign patent laws in countries wherein
counterparts of the subject application, or its progeny, are filed. However, it should be
understood that the availability of a deposit does not constitute a license to practice the subject
invention in derogation of patent rights granted by governmental action.

Further, the subject culture deposit will be stored and made available to the public in
accord with the provisions of the Budapest Treaty for the Deposit of Microorganisms, i.e., it will
be stored with all the care necessary to keep it viable and uncontaminated for a period of at least
five years after the most recent request for the furnishing of a sample of the deposit, and in any
case, for a period of at least thirty (30) years after the date of deposit or for the enforceable life
of any patent which may issue disclosing the culture. The depositor acknowledges the duty to
replace the deposit should the depository be unable to furnish a sample when requested, due to
the condition of the deposit. All restrictions on the availability to the public of the subject
culture deposit will be irrevocably removed upon the granting of a patent disclosing it.

PCT/US96/13585

WO 98/07834

As would be apparent to a person of ordinary skill in the art, seeds and plants that are
homozygous for the Sh2-m1Rev6 allele can be readily prepared from heterozygous seeds using
techniques that are standard in the art. In addition, the Sh2-m1Rev6 gene can be readily obtained
from the deposited seeds.

The skilled artisan, using standard techniques known in the art, can also prepare 
polynucleotide molecules that encode additional amino acid residues, such as serine, at the 
location of the insertions in the subject revertants. Such polynucleotide molecules are included 
within the scope of the subject invention. 

It should be understood that the examples and embodiments described herein are for
illustrative purposes only and that various modifications or changes in light thereof will be
suggested to persons skilled in the art and are to be included within the scope and purview of this
application and the scope of the appended claims.
TAAGAGGGGT GCACCTACCA TAGATTTTTT GGGCTCCCTG GCCTCTCCTT TCTTCCGCCT 60
QAAAACAACC TACATGGATA CATCTQCAAC CAGAGGGAGT ATCT0AT6CT TTTTCCTGGG 120
CAGGGAGAGC TATGAGACGT ATGTCCTCAA AGCCACTTTG CATTGTGTGA AACCAATATC 180
OATCTTTGTT ACTTCATCAT GCATGAACAT TTGTGGAAAC TACTAGCTTA CAAGCATTAG 240
TGACAGCTCA GAAAAAAGTT ATCTCTGAAA GGTTTCATGT GTACCGTGGG AAATGAGAAA 300
TGTTGCCAAC TCAAACACCT TCAATATGTT GTTTGCAGGC AAACTCTTCT CCAAGAAAGG 360
TGTCTAAAAC TATGAACGGG TTACAGAAAG OTATAAACCA CGGCTGTGCA TTTTGGAAGT 420
ATCATCTATA GATGTCTGTT GAGGGGAAAG CCOTACGCCA ACGTTATTTA CTCAGAAACA 480
GCTTCAACAC ACAGTTGTCT GCTTTATGAT GGCATCTCCA CCCAGGCACC CACCATCACC 540
TATTCACCTA TCTCTCGTGC CTGTTTATTT TCTTGCCCTT TCTGATCATA AAAAATCATT 600
AAGAGTTTGC AAACATGCAT AGGCATATCA ATATGCTCAT TTATTAATTT GCTAGCAGAT 660
CATCTTCCTA CTCTTTACTT TATTTATTGT TTGAAAAATA TGTCCTGCAC CTAGGGAGCT 720
CGTATACAGT ACCAATGCAT CTTCATTAAA TGTGAATTTC AGAAAGGAAG TAGGAACCTA 780
TGAGAGTATT TTTCAAAATT AATTAGCOGC TTCTATTATG TTTATAGCAA AGGCCAAGGG 840
CAAAATCGGA ACACTAATGA TGOTTOGTTQ CATGAGTCTG TOGATTACTT GCAAGAAATG 900
TGAACCTTTG TTTCTOTOCG TGGGCATAAA ACAAACAGCT TCTAGCCTCT TTTACGGTAC 960
TTGCACTTGC AAGAAATGTG AACTCCTTTT CATTTCTGTA TGTGGACATA ATGCCAAAGC 1020 
ATCCAGGCTT TTTCATGGTT GTTGATGTCT TTACACAGTT CATCTCCACC AGTATGCCCT 1080 
CCTCATACTC TATATAAACA CATCAACAGC ATCGCAATTA GCCACAAGAT CACTTCGGGA 1140 
CGCAAGTGTG ATTTCGACCT TGCAGCCACC TTTTTTTGTT CTGTTGTAAG TATACTTTCC 1200 
CTTACCATCT TTATCTGTTA GTTTAATTTO TAATTGGGAA GTATTAGTGG AAAGAGGATG 1260 
AGATGCTATC ATCTATGTAC TCTGCAAATG CATCTGACGT TATATGGGCT GCTTCATATA 1320 
ATTTGAATTG CTCCATTCTT GCCGACAATA TATTGCAAGG TATATGCCTA GTTCCATCAA 1380 
AAGTTCTOTT TTTTCATTCT AAAAGCATTT TAGTGGCACG CAATTTTGTC CATGAGGGAA 1440 
AGGAAATCTC TTTTGGTTAC TTTGCTTGAG GTGCATTCTT CATATGTCCA GTTTTATGGA 1500 
AGTAATAAAC TTCAOTTTGG TCATAAGATG TCATATTAAA GGGCAAACAT ATATTCAATG 1560 
TTCAATTCAT CGTAAATGTT CCCTTTTTGT AAAAOATTGC ATACTCATTT ATTTGAGTTG 1620 
CAGGTGTATC TAGTAGTTGG AGGAGATATG CAGTTTGCAC TTGCATTGGA CACGAACTCA 1680 
GGTCCTCACC AGATAAGATC TTGTGAGGGT GATGGCATTG ACAGGTTGGA AAAATTAAGT 1740 
ATTGGGGGCA GAAAGCAGGA GAAAGCTTTG AGAAATAGGT GCTTTGGTGG TAGAGTTGCT 1800 
GCAACTACAC AATGTATTCT TACCTCAGAT GCTTGTCCTG AAACTCTTCT AAGTATCCAC 1860
CTCAATTATT ACTCTTACAT GTTGGTTTAC TTTACGTTTG TCTTTTCAAG GGAAATTTAC 1920 
TGTATTTTTT GTGTTTTGTG GGAGTTCTAT ACTTCTGTTG GACTGGTTAT TGTAAAGATT 1980 
TGTTCAAATA GGGTCATCTA ATAATTGTTT GAAATCTGGG AACTGTGGTT TCACTGCGTT 2040 
CAGGAAAAAG TGAATTATTG GTTACTGCAT GAATAACTTA TGGAAATAGA CCTTAGAGTT 2100
GCTGCATGAT TATCACAAAT CATTGCTACG ATATCTTATA ATAGTTCTTT CGACCTCGCA 2160
TTACATATAT AACTGCAACT CCTAGTTGCG TTCAAAAAAA AAAATGCAAC TCTTAGAACG 2220
CTCACCAGTG TAATCTTTCC TGAATTGTTA TTTAATGGCA TGTATGCACT ACTTGTATAC 2280
TTATCTAGGA TTAAGTAATC TAACTCTAGG CCCCATATTT GCAGCATTCT CAAACACAGT 2340
CCTCTAOOAA AAATTATGCT GATGCAAACC GTGTATCTGC TATCATTTTG GGCGGAGGCA 2400
CTGOATCTCA GCTCTTTCCT CTGACAAGCA CAAGAGCTAC GCCTGCTGTA AGGGATAACA 2460
CT6AACATCC AACGTTGATT ACTCTATTAT AGTATTATAC AGACTGTACT TTTCGAATTT 2520
ATCTTAGTTT TCTACAATAT TTAGTGGATT CTTCTCATTT TGAAGATACA CAATTGATCC 2580
ATAATCGAAG TGGTATGTAA GACAGTGAGT TAAAAGATTA TATTTTTTGG GAGACTTCCA 2640
GTCAAATTTT CTTAGAAGTT TTTTTGGTCC AGATGTTCAT AAAGTOGCCG CTTTCATACT 2700
TTTTTTAATT TTTTAATTGG TCCACTATTA GGTACCTGTT GGAGGATGTT ACAGGCTTAT 2760
TGATATCCCT ATGAGTAACT GCTTCAACAG TGGTATAAAT AAGATATTTG TGATGAGTCA 2820
GTTCAATTCT ACTTOGCTTA ACCGCCATAT TCATCGTACA TACCTTGAAG GCGGGATCAA 2880
CTTTGCTGAT GGATCTGTAC AGGTGATTTA CCTGATCTTG TTGATGTGTA ATACTGTAAT 2940
TAGGAGTAGA TTTGTGTGGA GAGAATAATA AACAGATGCC GAGATTCTTT TCTAAAAGTC 3000
TAGATCCAAA GGCATTGTGG TTCAAAACAC TATGGACTTC TACCATTTAT GTCATTACTT 3060
TGCCTTAATG TTCCATTGAA TGGGGCAAAT TATTGATTCT ACAAGTGTTT AATTAAAAAC 3120
TAATTGTTCA TCCTGCAGGT ATTAGCGGCT ACACAAATGC CTGAAGAGCC AGCTGGATGG 3180
TTCCAGGGTA CAGCAGACTC TATCAGAAAA TTTATCTGGG TACTCGAGGT AGTTGATATT 3240
TTCTCGTTTA TGAATGTCCA TTCACTCATT CCTGTAGCAT TCTTTCTTTG TAATTTTGAG 3300
TTCTCCTGTA TTTCTTTAGG ATTATTACAG TCACAAATCC ATTGACAACA TTGTAATCTT 3360
GAGTGGCGAT CAGCTTTATC GGATGAATTA CATGGAACTT GTGCAGGTAT GGTGTTCTCT 3420
TGTTCCTCAT GTTTCAOGTA ATGTCCTGAT TTTGGATTAA CCAACTACTT TTGGCATGCA 3480
TTATTTCCAG AAACATGTCG AGGAOGATGG TGATATCACT ATATCATGTG CTCCTGTTGA 3540
TGAGAGGTAA TCAGTTGTTT ATATCATCCT AATATGAATA TGTCATCTTG TTATCCAACA 3600
GAGGATGGAT ATGGTCTAAT CTGCTTTCCT TTTTTTTCCC TTCGGAAGCC GAGCTTCTAA 3660
AAATGGGCTA GTGAAGATTG ATCATACTGG AOGTGTACTT CAATTCTTTG AAAAACGAAA 3720
GGGTGCTGAT TTGAATTCTA TGGTTAGAAA TTCCTTGTGT AATCCAATTC TTTTGTTTTC 3780
CTTTCTTTCT TGAGATGAAC CCCTCTTTTA GTTATTTCCA TGGATAACCT GTACTTGACT 3840
TATTCAGAAA TGATTTTCTA TTTTGCTGTA GAATCTGACA CTAAAGCTAA TAGCACTGAT 3900
GTTGCAGAGA GTTGAGACCA ACTTCCTGAG CTATGCTATA GATGATGCAC AGAAATATCC 3960
ATACCTTGCA TCAATGGGCA TTTATGTCTT CAAGAAAGAT GCACTTTTAG ACCTTCTCAA 4020
GTAATCACTT TCCTGTGACT TATTTCTATC CAACTCCTAG TTTACCTTCT AACAGTGTCA 4080
ATTCTTAGGT CAAAATATAC TCAATTACAT GACTTTGGAT CTGAAATCCT CCCAAGAGCT 4140

GTACTAGATC ATAGTGTGCA GGTAAGTCTG ATCTGTCTGG AGTATGTGTT CTGTAAACTG 4200 

TAAATTCTTC ATGTCAAAAA GTTGTTTTTG TTTCCAGTTT CCACTACCAA TGCACGATTT 4260 

ATGTATTTTC GCTTCCATGC ATCATACATA CTAACAATAC ATTTTACGTA TTGTGTTAGG 4320 

CATGCATTTT TACGGGCTAT TGGGAGGATG TTGGAACAAT CAAATCATTC TTTGATGCAA 4380 

ACTTGGCCCT CACTGAGCAG GTACTCTGTC ATGTATTCTG TACTGCATAT ATATTACCTG 4440 

GAATTCAATG CATAGAATGT GTTAGACCAT CTTAGTTCCA TCCTGTTTTC TTCAATTAGC 4500 

TTATCATTTA ATAGTTGTTG GCTAGAATTT AAACACAAAT TTACCTAATA TGTTTCTCTC 4560 

TTCAGCCTTC CAAGTTTGAT TTTTACGATC CAAAAACACC TTTCTTCACT GCACCCCGAT 4620 

GCTTGCCTCC GACGCAATTG GACAAGTGCA AGGTATATGT CTTACTGAGC ACAATTGTTA 4680

CCTGAGCAAG ATPTTGTGTA CTTGACTTGT TCTCCTCCAC AGATGAAATA TGCATTTATC 4740 

TCAGATGGTT GCTTACTGAG AGAATGCAAC ATCGAGCATT CTGTGATTGG AGTCTGCTCA 4800 

CGTGTCAGCT CTGGATGTGA ACTCAAGGTA CATACTCTCC CAATGTATCT ACTCTTGAGT 4860 

ATACCATTTC AACACCAAGC ATCACCAAAT CACACAGAAC AATAGCAACA AAGCCTTTTA 4920 

GTTCCAAGCA ATTTAGGGTA GCCTAGAGTT GAAATCTAAC AAAACAAAAG TCAAAGCTCT 4980 

ATCACGTGGA TAOTTGTTTT CCATGCACTC TTATTTAAGC TAATTTTTTG GGTATACTAC 5040 

ATCCATTTAA TTATTGTTTT ATTGCTTCTT CCCTTTGCCT TTCCCCCATT ACTATCGCGT 5100 

CTTAAGATCA TACTACGCAC TAOTGTCTTT AGAGGTCTCT GGTGGACATG TTCAAACCAT 5160 

CTCAATCGGT GTTGGACAAG TTTTTCTTOA ATTTGTGCTA CACCTAACCT ATCACGTATG 5220

TCATCGTTTC AAACTCGATC CTTCCTGTAT CATCATAAAT CCAATGCAAC ATACGCATTT 5280 

ATGCAACATT TATCTGTTGA ACATOTCATC TTTTTOTAGG TTAACATTAT GCACCATACA 5340 

ATGTAGCATG TCTAATCATC ATCCTATAAA ATTTACATTT TAGCTTATGT GGTATCCTCT 5400 

TGCCACTTAG AACACCATAT GCTTGATGCC ATTTCATCCA CCCTGCTTTO ATTCTATGGC 5460 

TAACATCTTC ATTAATATCC TCGCCTCTCT GTATCATTGG TCCTAAATAT GGAAATACAT 5520 

TCTTTCTGGG CACTACTTGA CCTTCCAAAC TAACOTCTCC TTTGCTCCTT TCTTGTGTGT 5580 

AGTAGTACCG AAGTCACATC TCATATATTC GGTTTTAGTT CTACTAAGTC CCGGGTTCGA 5640 

TCCCCCTCAG GGGTGAATTT CGGGCTTGGT AAAAAAAATC CCCTCGCTGT GTCCCGCCCG 5700 

CTCTCGGGGA TCGATATCCT GCGCGCCACC CTCCGGCTGG GCATTGCAGA GTGAGCAGTT 5760 

GATCGGCTCG TTAGTGATGG GGAGCGGGGT TCAAGGGTTT TCTCGGCCGG GACCATGTTT 5820 

CGGTCTCTTA ATATAATGCC GGGAGGGCAG TCTTTCCCTC CCCGGTCGAG TTTTAGTTCT 5880 

ACCGAGTCTA AAACCTTTGG ACTCTAGAGT CCCCTGTCAC AACTCACAAC TCTAGTTTTC 5940 

TATTTACTTC TACCTAGCGT TTATTAATGA TCACTATATC GTCTGTAAAA AGCATACACC 6000 

AATGTAATCC CCTTGTATGT CCCTTGTAAT ATTATCCATC ACAAGAAAAA AAGGTAAGGC 6060 

TCAAAGTTGA CTTTTGATAT AGTCCTATTC TAATCGAGAA GTCATCTGTA TCTTCGTCTC 6120

TTGTTCOAAC ACTAGTCACA AAATTTTTTG TACATGTTCT TAATGAGTCC AACGTAATAT 6180

TCCTTGATAT TTTGTCATAA GCCCTCATCA AGTCAATGAA AATCACGTGT AGGTCCTTCA 6240 

TTTGTTCCTT ATACTGCTCC ATCACTTGTC TCATTAAGAA AATCTCTCTC ATAGTTAAOC 6300 

TTTTGGCATG AAACAAAATC ACACAGAAGT TGTTTCCTTT TTTTAAGATC CCACACAAAA 6360 

GAGCTTTGAT CTAAGGAATC TGGATCCCTG ACAGGTTTAT CAAAATCCTT TGTGTTTTTC 6420 

TTAAAACTGA ATATTCCTCC AGCTTCTAGT ATTGATGTAA TATTCAATCT GTTTAGCAAG 6480 

TGAACACCTT GGTTCTTGTT GTTACTGTAC CCCCCCCCCC CCCCCCCCCC CGAGGCCCAG 6540 

ATTACCACGA CATOAATACA AGAATATTGA ACCCAGATCT AGAGTTTGTT TGTACTGTTG 6600 

AAAATOGGTG ACAATTCATT TTGTTATTGC GCTTTCTOAT AAOGACAGGA CTCCGTGATG 6660 

ATGGGAGCGG ACACCTATGA AACTGAAGAA GAAGCTTCAA AGCTACTGTT AGCTGGGAAG 6720 

GTCCCAGTTG GAATAGGAAG GAACACAAAG ATAAGGTGAG TATGOATGTG GAACCACCGG 6780 

TTAGTTCCCA AAAATATCAC TCACTGATAC CTGATGGTAT CCTCTGATTA TTTTCAGGAA 6840 

CTGTATCATT GACATGAATG CTAGGATTGG GAAGAACGTG GTGATCACAA ACAGTAAGGT 6900 

GAGCGAGCGC ACCTACATGG GTGCAGAATC TTGTGTGCTC ATCTATCCTA ATTCGGTAAT 6960 

TCCTATCCAG CGCTAGTCTT GTGACCATGG GGCATGGGTT CGACTCTGTG ACAGGGCATC 7020 

CAAGAGGCTG ATCACCCGGA AGAAGGGTAC TCGTACTACA TAAGGTCTGG AATCGTGGTG 7080 

ATCTTGAAGA ATGCAACCAT CAACGATGGG TCTGTCATAT AGATCGGCTG CGTGTGCGTC 7140 

TACAAAACAA GAACCTACAA TGGTATTGCA TCGATGGATC GTGTAACCTT GGTATGGTAA 7200 

GAGCCGCTTG ACAGAAAGTC GAGCGTTCGG GCAAGATGCG TAGTCTGGCA TGCTGTTCCT 7260 

TGACCATTTG TGCTGCTAGT ATGTACTGTT ATAAGCTGCC CTAGAAGTTG CAGCAAACCT 7320 

TTTTATGAAC CTTTGTATTT CCATTACCTG CTTTGGATCA ACTATATCTG TCATCCTATA 7380 

TATTACTAAA TTTTTACGTG TTTTTCTAAT TCGGTGCTGC TTTTGGGATC TGGCTTCGAT 7440 

GACCGCTCGA CCCTGGGCCA TTGGTTCAGC TCTGTTCCTT AGAGCAACTC CAAGGAGTCC 7500 

TAAATTTTGT ATTAGATACG AAGGACTTCA GCCGTGTATG TCGTCCTCAC CAAACGCTCT 7560 

TTTTGCATAG TGCAGGGCTT GTAGACTTGT AGCCCTTGTT TAAAGAGGAA TTTGAATATC 7620 

AAATTATAAG TATTAAATAT ATATTTAATT AGGTTAACAA ATTTGGCTCG TTTTTAGTCT 7680 

TTATTTATGT AATTAGTTTT AAAAATAGAC CTATATTTCA ATACGAAATA TCATTAACAT 7740 
OGATA 



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1919 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

ACAAGATCAC TTCGGGAGGC AAGTGCCATT TTGATCTTGC AGCCACCTTT TTTTGTTCTG 60 

TTGTGTATCT AGTAGTTGGA GGAGATATGC AGTTTGCACT TGCATTGGAC ACGAACTCAG 120 

GTCCTCACCA GATAAGATCT TGTGAGGGTG ATGGGATTGA CAGGTTGGAA AAATTAAGTA 180 

TTGGGGGCAG AAAGCAGGAG AAAGCTTTGA GAAATAGGTG CTTTGGTGGT AGAGTTGCTG 240 

CAACTACACA ATGTATTCTT ACCTCAGATG CTTGTCCTGA AACTCTTCAT TCTCAAACAC 300 

AGTCCTCTAG GAAAAATTAT GCTGATGCAA ACCGTGTATC TGCGATCATT TTGGGCGGAG 360 

GCACTGGATC TCAGCTCTTT CCTCTGACAA GCACAAGAGC TACGCCTGCT GTACCTGTTG 420 

GAGGATGTTA CAGGCTTATT GATATCCCTA TOAGTAACTG CTTCAACAGT GGTATAAATA 480 

AGATATTTGT GATGAGTCAG TTCAATTCTA CTTCGCTTAA CCGCCATATT CATCGTACAT 540 

ACCTTGAAGG CGGGATCAAC TTTGCTGATG GATCTGTACA GGTATTAGCG GCTACACAAA 600 

TGCCTGAAGA GCCAGCTGGA TGGTTCCAGG GTACAGCAGA CTCTATCAGA AAATTTATCT 660 

GGGTACT^A^ATTATTAC AGTCACAAAT CCATTGACAA CATTGTAATC TTGAGTGGCG 720 

ATCAGCTTTA TCGGATGAAT TACATGGAAC TTGTGCAGAA ACATGTCGAG GACGATGCTG 780 

ATATCACTAT ATCATGTGCT CCTGTTGATG AGAGCCGAGC TTCTAAAAAT GGGCTAGTGA 840 

AGATTGATCA TACTGGACGT GTACTTCAAT TCTTTGAAAA ACCAAAGGGT GCTGATTTGA 900 

ATTCTATGAG AGTTGAGACC AACTTCCTGA GCTATGCTAT AGATGATGCA CAGAAATATC 960 

CATACCTTGC ATCAATGGGC ATTTATOTCT TCAAGAAAGA TGCACTTTTA GACCTTCTCA 1020 

AGTCAAAATA TACTCAATTA CATGACTTTG GATCTGAAAT CCTCCCAAGA GCTGTACTAG 1080 

ATCATAGTGT GCAGGCATGC ATTTTTACGG GCTATTGGGA GGATGTTGGA ACAATCAAAT 1140 

CATTCTTTGA TGCAAACTTG GCCCTCACTG AGCAGCCTTC CAAGTTTGAT TTTTACGATC 1200 

CAAAAACACC TTTCTTCACT GCACCCCGAT GCTTGCCTCC GACGCAATTG GACAAGTGCA 1260 

AGATGAAATA TGCATTTATC TCAGATGGTT GCTTACTGAG AGAATGCAAC ATCGAGCATT 1320 

CTGTGATTGG AGTCTGCTCA CGTGTCAGCT CTGGATGTGA ACTCAAGGAC TCCGTGATGA 1380 

TGGGAGCGGA CATCTATGAA ACTGAAGAAG AAGCTTCAAA GCTACTGTTA GCTGGGAAGG 1440 

TCCCGATTGG AATAGGAAGG AACACAAAGA TAAGGAACTG TATCATTGAC ATGAATGCTA 1500 

GGATTGGGAA GAACGTGGTG ATCACAAACA GTAAGGGCAT CCAAGAGGCT GATCACCCGG 1560 

AAGAAGGGTA CTCGTACTAC ATAAGGTCTG GAATCGTGGT GATCCTGAAG AATGCAACCA 1620 

TCAACGATGG GTCTGTCATA TAGATCGGCT GCGTTTGCGT CTACAAAACA AGAACCTACA 1680 

ATGGTATTGC ATCGATGGAT CGTGTAACCT TGGTATGGTA AGAGCCOCTT GACAGGAAGT 1740 

CGAGCTTCGG GCGAAGATGC TAGTCTGGCA TGCTGTTCCT TGACCATTTG TGCTGCTAGT 1800 

ATGTACCTGT TATAAGCTGC CCTAGAAGTT GCAGCAAACC TTTTTATGAA CCTTTGTATT 1860 

TCCATTACCC TGCTTTGGAT CAACTATATC TGTCAGTCCT ATATATTACT AAATTTTTA 1919

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 518 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Met Gln Phe Ala Leu Ala Leu Asp Thr Asn Ser Gly Pro His Gln Ile
1 5 10 15

Arg Ser Cys Glu Gly Asp Gly Ile Asp Arg Leu Glu Lys Leu Ser Ile
20 25 30

Gly Gly Arg Lys Gln Glu Lys Ala Leu Arg Asn Arg Cys Phe Gly Gly
35 40 45

Arg Val Ala Ala Thr Thr Gln Cys Ile Leu Thr Ser Asp Ala Cys Pro
50 55 60

Glu Thr Leu His Ser Gln Thr Gln Ser Ser Arg Lys Asn Tyr Ala Asp
65 70 75 80

Ala Asn Arg Val Ser Ala Ile Ile Leu Gly Gly Gly Thr Gly Ser Gln
85 90 95

Leu Phe Pro Leu Thr Ser Thr Arg Ala Thr Pro Ala Val Pro Val Gly
100 105 110

Gly Cys Tyr Arg Leu Ile Asp Ile Pro Met Ser Asn Cys Phe Asn Ser
115 120 125

Gly Ile Asn Lys Ile Phe Val Met Ser Gln Phe Asn Ser Thr Ser Leu
130 135 140

Asn Arg His Ile His Arg Thr Tyr Leu Glu Gly Gly Ile Asn Phe Ala
145 150 155 160

Asp Gly Ser Val Gln Val Leu Ala Ala Thr Gln Met Pro Glu Glu Pro
165 170 175

Ala Gly Trp Phe Gln Gly Thr Ala Asp Ser Ile Arg Lys Phe Ile Trp
180 185 190

Val Leu Glu Asp Tyr Tyr Ser His Lys Ser Ile Asp Asn Ile Val Ile
195 200 205

Leu Ser Gly Asp Gln Leu Tyr Arg Met Asn Tyr Met Glu Leu Val Gln
210 215 220

Lys His Val Glu Asp Asp Ala Asp Ile Thr Ile Ser Cys Ala Pro Val
225 230 235 240

Asp Glu Ser Arg Ala Ser Lys Asn Gly Leu Val Lys Ile Asp His Thr
245 250 255

Gly Arg Val Leu Gln Phe Phe Glu Lys Pro Lys Gly Ala Asp Leu Asn
260 265 270

Ser Met Arg Val Glu Thr Asn Phe Leu Ser Tyr Ala Ile Asp Asp Ala
275 280 285
Gln Lys Tyr Pro Tyr Leu Ala Ser Met Gly Ile Tyr Val Phe Lys Lys
290 295 300

Asp Ala Leu Leu Asp Leu Leu Lys Ser Lys Tyr Thr Gln Leu His Asp
305 310 315 320

Phe Gly Ser Glu Ile Leu Pro Arg Ala Val Leu Asp His Ser Val Gln
325 330 335

Ala Cys Ile Phe Thr Gly Tyr Trp Glu Asp Val Gly Thr Ile Lys Ser
340 345 350

Phe Phe Asp Ala Asn Leu Ala Leu Thr Glu Gln Pro Ser Lys Phe Asp
355 360 365

Phe Tyr Asp Pro Lys Thr Pro Phe Phe Thr Ala Pro Arg Cys Leu Pro
370 375 380

Pro Thr Gln Leu Asp Lys Cys Lys Met Lys Tyr Ala Phe Ile Ser Asp
385 390 395 400

Gly Cys Leu Leu Arg Glu Cys Asn Ile Glu His Ser Val Ile Gly Val
405 410 415

Cys Ser Arg Val Ser Ser Gly Cys Glu Leu Lys Asp Ser Val Met Met
420 425 430

Gly Ala Asp Ile Tyr Glu Thr Glu Glu Glu Ala Ser Lys Leu Leu Leu
435 440 445

Ala Gly Lys Val Pro Ile Gly Ile Gly Arg Asn Thr Lys Ile Arg Asn
450 455 460

Cys Ile Ile Asp Met Asn Ala Arg Ile Gly Lys Asn Val Val Ile Thr
465 470 475 480

Asn Ser Lys Gly Ile Gln Glu Ala Asp His Pro Glu Glu Gly Tyr Ser
485 490 495

Tyr Tyr Ile Arg Ser Gly Ile Val Val Ile Leu Lys Asn Ala Thr Ile
500 505 510

Asn Asp Gly Ser Val Ile
515

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1551 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

ATGCAGTTTG CACTTGCATT GGACACGAAC TCAGGTCCTC ACCAGATAAG ATCTTGTGAG 60 

GGTGATGGGA TTGACAGGTT GGAAAAATTA AGTATTGGGG GCAGAAAGCA GGAGAAAGCT 120 

TTGAGAAATA GGTGCTTTGG TGGTAGAGTT GCTGCAACTA CACAATGTAT TCTTACCTCA 180 

GATGCTTGTC CTGAAACTCT TCATTCTCAA ACACAGTCCT CTAGGAAAAA TTATGCTGAT 240 

GCAAACCGTG TATCTGCGAT CATTTTGGGC GGAGGCACTG GATCTCAGCT CTTTCCTCTG 300 
ACAAGCACAA GAGCTACGCC TGCTCTACCT GTTGGAGGAT GTTACAGGCT TATTGATATC 360

CCTATGAGTA ACTGCTTCAA CAGTGGTATA AATAAGATAT TTGTGATGAG TCAGTTCAAT 420 

TCTACTTCGC TTAACCGCCA TATTCATCGT ACATACCTTG AAGGCGGGAT CAACTTTGCT 480 

GATGGATCTG TACAGGTATT AGCGGCTACA CAAATGCCTG AAGAGCCAGC TGGATGGTTC 540 

CAGGGTACAG CAGACTCTAT CAGAAAATTT ATCTGGGTAC TCGAGGATTA TTACAGTCAC 600 

AAATCCATTG ACAACATTGT AATCTTGAGT GGCGATCAGC TTTATCGGAT GAATTACATG 660 

GAACTTGTGC AGAAACATGT CGAGGACGAT GCTGATATCA CTATATCATG TGCTCCTGTT 720

GATGAGAGCC GAGCTTCTAA AAATGGGCTA GTGAAGATTG ATCATACTGG ACGTGTACTT 780 

CAATTCTTTG AAAAACCAAA GGGTGCTGAT TTGAATTCTA TGAGAGTTGA GACCAACTTC 840 

CTGAGCTATG CTATAGATGA TGCACAGAAA TATCCATACC TTGCATCAAT GGGCATTTAT 900 

GTCTTCAAGA AAGATGCACT TTTAGACCTT CTCAAGTCAA AATATACTCA ATTACATGAC 960 

TTTGGATCTG AAATCCTCCC AAGAGCTGTA CTAGATGATA GTGTGCAGGC ATGCATTTTT 1020 

ACGGGCTATT GGGAGGATGT TGGAACAATC AAATCATTCT TTGATGCAAA CTTGGCCCTC 1080 

ACTGAGCAGC CTTCCAAGTT TGATTTTTAC GATCCAAAAA CACCTTTCTT CACTGCACCC 1140 

CGATGCTTGC CTCCGACGCA ATTGGACAAG TGCAAGATGA AATATGCATT TATCTCAGAT 1200 

GGTTGCTTAC TGAGAGAATG CAACATCGAG CATTCTGTGA TTGGAGTCTG CTCACGTGTC 1260 

AGCTCTGGAT GTGAACTCAA GGACTCCGTG ATGATGGGAG CGGACATCTA TGAAACTGAA 1320 

GAAGAAGCTT CAAAGCTACT GTTAGCTGGG AAGGTCCCGA TTGGAATAGG AAGGAACACA 1380 

AAGATAAGGA ACTGTATCAT TGACATGAAT GCTAGGATTG GGAAGAACGT GGTGATCACA 1440

AACAGTAAGG GCATCCAAGA GGCTGATCAC CCGGAAGAAG GGTCCTACTA CATAAGGTCT 1500 

GGAATCGTGG TGATCCTGAA GAATGCAACC ATCAACGATG GGTCTGTCAT A 1551 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 517 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Gln Phe Ala Leu Ala Leu Asp Thr Asn Ser Gly Pro His Gln Ile
1 5 10 15

Arg Ser Cys Glu Gly Asp Gly Ile Asp Arg Leu Glu Lys Leu Ser Ile
20 25 30

Gly Gly Arg Lys Gln Glu Lys Ala Leu Arg Asn Arg Cys Phe Gly Gly
35 40 45

Arg Val Ala Ala Thr Thr Gln Cys Ile Leu Thr Ser Asp Ala Cys Pro
50 55 60
Glu Thr Leu His Ser Gln Thr Gln Ser Ser Arg Lys Asn Tyr Ala Asp
65 70 75 80

Ala Asn Arg Val Ser Ala Ile Ile Leu Gly Gly Gly Thr Gly Ser Gln
85 90 95

Leu Phe Pro Leu Thr Ser Thr Arg Ala Thr Pro Ala Val Pro Val Gly
100 105 110

Gly Cys Tyr Arg Leu Ile Asp Ile Pro Met Ser Asn Cys Phe Asn Ser
115 120 125

Gly Ile Asn Lys Ile Phe Val Met Ser Gln Phe Asn Ser Thr Ser Leu
130 135 140

Asn Arg His Ile His Arg Thr Tyr Leu Glu Gly Gly Ile Asn Phe Ala
145 150 155 160

Asp Gly Ser Val Gln Val Leu Ala Ala Thr Gln Met Pro Glu Glu Pro
165 170 175

Ala Gly Trp Phe Gln Gly Thr Ala Asp Ser Ile Arg Lys Phe Ile Trp
180 185 190

Val Leu Glu Asp Tyr Tyr Ser His Lys Ser Ile Asp Asn Ile Val Ile
195 200 205

Leu Ser Gly Asp Gln Leu Tyr Arg Met Asn Tyr Met Glu Leu Val Gln
210 215 220

Lys His Val Glu Asp Asp Ala Asp Ile Thr Ile Ser Cys Ala Pro Val
225 230 235 240

Asp Glu Ser Arg Ala Ser Lys Asn Gly Leu Val Lys Ile Asp His Thr
245 250 255

Gly Arg Val Leu Gln Phe Phe Glu Lys Pro Lys Gly Ala Asp Leu Asn
260 265 270

Ser Met Arg Val Glu Thr Asn Phe Leu Ser Tyr Ala Ile Asp Asp Ala
275 280 285

Gln Lys Tyr Pro Tyr Leu Ala Ser Met Gly Ile Tyr Val Phe Lys Lys
290 295 300

Asp Ala Leu Leu Asp Leu Leu Lys Ser Lys Tyr Thr Gln Leu His Asp
305 310 315 320

Phe Gly Ser Glu Ile Leu Pro Arg Ala Val Leu Asp His Ser Val Gln
325 330 335

Ala Cys Ile Phe Thr Gly Tyr Trp Glu Asp Val Gly Thr Ile Lys Ser
340 345 350

Phe Phe Asp Ala Asn Leu Ala Leu Thr Glu Gln Pro Ser Lys Phe Asp
355 360 365

Phe Tyr Asp Pro Lys Thr Pro Phe Phe Thr Ala Pro Arg Cys Leu Pro
370 375 380

Pro Thr Gln Leu Asp Lys Cys Lys Met Lys Tyr Ala Phe Ile Ser Asp
385 390 395 400

Gly Cys Leu Leu Arg Glu Cys Asn Ile Glu His Ser Val Ile Gly Val
405 410 415
Cys Ser Arg Val Ser Ser Gly Cys Glu Leu Lys Asp Ser Val Met Met
420 425 430

Gly Ala Asp Ile Tyr Glu Thr Glu Glu Glu Ala Ser Lys Leu Leu Leu
435 440 445

Ala Gly Lys Val Pro Ile Gly Ile Gly Arg Asn Thr Lys Ile Arg Asn
450 455 460

Cys Ile Ile Asp Met Asn Ala Arg Ile Gly Lys Asn Val Val Ile Thr
465 470 475 480

Asn Ser Lys Gly Ile Gln Glu Ala Asp His Pro Glu Glu Gly Ser Tyr
485 490 495

Tyr Ile Arg Ser Gly Ile Val Val Ile Leu Lys Asn Ala Thr Ile Asn
500 505 510

Asp Gly Ser Val Ile
515
Claims

1. A polynucleotide molecule, comprising a variant of the wild type shrunken-2 (SH2)
gene, wherein said variant codes for the insertion of at least one additional amino acid within or
close to the allosteric binding site of the ADP-glucose pyrophosphorylase (AGP) enzyme subunit,
whereby a plant expressing said polynucleotide molecule has increased seed weight relative to the
seed weight of a plant expressing the wild type Sh2 gene.

2. The polynucleotide molecule, according to claim 1, wherein said polynucleotide
molecule encodes at least one serine residue inserted between amino acids 494 and 495 of the
native AGP enzyme subunit.

3. The polynucleotide molecule, according to claim 1, wherein said polynucleotide
molecule encodes the amino acid pair tyrosine:serine, wherein said amino acid pair is inserted
between amino acids 494 and 495 of the native AGP enzyme subunit.

4. The polynucleotide molecule, according to claim 1, wherein said polynucleotide
molecule encodes the amino acid pair serine:tyrosine, wherein said amino acid pair is inserted
between amino acids 495 and 496 of the native AGP enzyme subunit.

5. The polynucleotide molecule, according to claim 1, wherein the AGP enzyme encoded
by said polynucleotide molecule consists essentially of an amino acid sequence selected from the
group consisting of SEQ ID NO. 5 and SEQ ID NO. 3.

6. The polynucleotide molecule, according to claim 5, wherein the nucleotide sequence
encoding SEQ ID NO. 3 comprises nucleotides 87 through 1640 of the sequence shown in SEQ
ID NO. 2 or a degenerate fragment thereof.

7. A method for increasing the seed weight of a plant, comprising incorporating the
polynucleotide molecule of claim 1 into the genome of said plant and expressing the protein
encoded by said polynucleotide molecule.

8. The method, according to claim 7, wherein said plant is Zea mays.






9. A plant seed comprising the polynucleotide molecule of claim 1 within the genome 
of said seed. 
10. A plant expressing the polynucleotide molecule of claim 1.

11. The plant, according to claim 10, wherein said plant is Zea mays.

12. The plant, according to claim 10, wherein said plant is grown from the seed of claim 9.

13. A variant ADP-glucose pyrophosphorylase (AGP) protein, wherein said protein has
at least one additional amino acid inserted within or close to the allosteric binding site of the
wild-type AGP protein.

14. The variant AGP protein, according to claim 13, wherein said protein has at least one
serine residue inserted between amino acids 494 and 495 of the wild-type AGP protein sequence.

15. The variant AGP protein, according to claim 11, wherein said protein has the amino
acid pair tyrosine:serine inserted between amino acids 494 and 495 of the wild-type AGP protein
sequence.

16. The variant AGP protein, according to claim 11, wherein said protein has the amino
acid pair serine:tyrosine inserted between amino acids 495 and 496 of the wild-type AGP protein
sequence.

17. The variant AGP protein, according to claim 13, wherein said protein consists
essentially of an amino acid sequence selected from the group consisting of SEQ ID NO. 5 and
SEQ ID NO. 3.

18. The variant AGP protein, according to claim 13, wherein said protein is expressed
in the endosperm of a plant during seed development.